A useful extension to PostgreSQL typically includes multiple SQL objects; for example, a new data type will require new functions, new operators, and probably new index operator classes. It is helpful to collect all these objects into a single package to simplify database management. PostgreSQL documentation
Our second goal is to allow new data types, new operators and new access methods to be included in the DBMS. Moreover, it is crucial that they be implementable by non-experts which means easy-to-use interfaces should be preserved for any code that will be written by a user.
M. Stonebraeker and L. Rowe, The design of Postgres, May 1986
create/alter/drop extension
(Dimitri Fontaine, Tom
Lane)create extension
(Petr Jelínek)Subsequent characters in an identifier or key word can be letters, underscores, digits (0-9), or dollar signs ($).PostgreSQL Documentation
create temporary table allRecipes (html text);
copy allRecipes from program 'curl -s
"{url}"
| hxnormalize -x | hxselect -i
"{tag selector}" | tr "\n" " "';
tr
to remove newlineslaetitia=# select * from allrecipes ;
html
----------------------------------------------
{...}
(1 row)
with rawHtml(parts) as (
/* Let's get the url links at the begining */
select regexp_split_to_table(html, 'href="')
from allRecipes
)
/* Let's stop at the end of the link's url */
select split_part(parts, '/">', 1)
from rawHtml
/* The first part does not contain a real link */
where rawHtml.parts !~ '^{html selector}'
;
regexp_split_to_table
split_parts
split_part
-----------------------------------------------------
{url}/213251/liege-belgian-waffles-with-pearl-sugar
{url}/17681/belgian-waffles
{url}/88309/whole-grain-waffles
{url}/20513/classic-waffles
{url}/22180/waffles-i
{url}/246198/easy-french-toast-waffles
{url}/220890/potato-waffles
{url}/268309/cream-puff-crack-buns-choux-au-craquelin
{url}/246917/sams-sourdough-waffles
{url}/240386/cinnamon-roll-waffles
{url}/233920/tender-and-easy-buttermilk-waffles
{url}/171978/great-easy-waffles
{url}/261650/over-the-top-smores-waffles
{url}/237196/whole-wheat-coconut-oil-waffles
Let's look at the first link's html source code!
/((((((\\\\
=======((((((((((\\\\\
(( \\\\\\\
( (* _/ \\\\\\\
\ / \ \\\\\\________________
| | | </ ((\\\\
o_| / / \ \\\\ \\\\\\\
| ._ ( \ \\\\\\\\\\\\\\\\
| / / / \\\\\\\ \\
.______/\/ / / / \\\
/ __.____/ _/ ________( /\
/ / / ________/`_________' \ / \_
/ / \ \ \ \ \_ \
( < \ \ > / \ \
\/ \\_ / / > )
\_| / / / /
_// _//
/_| /_|
Let's look at the first link's html source code!
Let's get this json and parse it!
create temporary table processedRecipe(
data text,
recipe jsonb,
title text generated always as
((recipe -> 'name')::text) stored,
{...}
);
Let's get this json and parse it!
{...}
image text generated always as (
((recipe -> 'image')::jsonb -> 'url')::text
) stored,
description text generated always as (
(recipe -> 'description')::text
) stored,
yield integer generated always as (
regexp_replace((
regexp_split_to_array(
(recipe -> 'recipeYield')::text,' '))[1],
'"',
'')::integer
) stored,
{...}
->
operatorLet's get this json and parse it!
{...}
ingredients jsonb generated always as (
recipe -> 'recipeIngredient') stored,
steps jsonb generated always as (
recipe -> 'recipeInstructions') stored,
keywords text[],
rating double precision generated always as (
((recipe -> 'aggregateRating')::jsonb
-> 'ratingValue')::float
) stored
);
jsonb
datatype instead of
json
for performance reasonsLet's get this json and parse it!
copy processedRecipe (data)
from program
'curl -s "{url}" |
hxnormalize -x |
hxselect -ci "script[type=application\/ld\+json]" |
tr -d "\n" |
sed "s=\\\\=\\\\\\\\=g"';
We now have a huge table of jsons inside the "data" column!
Sadly, this is not a valid json array for Postgres, so
I was not able to use a jsonb[]
datatype for
that column
Let's get this json and parse it!
update processedRecipe set recipe = (regexp_match(
data,
'{ *"@context": "http://schema.org", *"@type": "Recipe",.*}')
)[1]::jsonb,
keywords = array['belgium', 'waffle'];
And we're good!
Let's get this json and parse it!
select title,
description,
yield,
ingredients,
steps,
keywords,
rating
from processedRecipe;
--------------------------------------------------------------------------------------------------------------------------------------------------------
title | "Liege Belgian Waffles with Pearl Sugar"
description | "These are sweeter than traditional Brussels Belgian waffles - a sweeter waffle that can be served for breakfast or dessert. Serve waffles warm or cooled with whipped cream and berries or drizzled with warm, melted chocolate."
yield | 10
ingredients | ["1 (.25 ounce) package active dry yeast", "1 ½ tablespoons white sugar", "¾ cup lukewarm milk", "3 eggs", "1 cup melted butter", "2 teaspoons vanilla extract", "3 cups flour", "½ teaspoon salt", "1 ½ cups pearl sugar (such as Lars' Own®)"]
steps | [{"text": "Sprinkle the yeast and white sugar over warm milk in a small bowl. The milk should be no more than 100 degrees F (40 degrees C). Let stand for 15 minutes until the yeast softens and begins to form a creamy foam.\n", "@type": "HowToStep"}, {"text": "Whisk the eggs, melted butter, and vanilla extract into the yeast mixture until evenly blended; set aside. Stir together the flour and salt in a separate large bowl, and make a well in the center. Pour the egg mixture into the well, then stir in the flour mixture until a soft dough forms. Cover with a light cloth and let rise in a warm place (80 to 95 degrees F (27 to 35 degrees C)) until doubled in volume, about 30 minutes. Gently mix in the pearl sugar.\n", "@type": "HowToStep"}, {"text": "Preheat a waffle iron according to manufacturer's instructions.\n", "@type": "HowToStep"}, {"text": "Place a baseball-size ball of dough on the preheated waffle iron. Cook waffles until golden and crisp, about 2 minutes. Repeat with remaining dough. Allow waffles to cool for 2 to 3 minutes before serving.\n", "@type": "HowToStep"}]
keywords | {belgium,waffle}
rating | 4.738255033557047
create table ingredient (
id integer generated always as identity primary key,
name text unique not null
);
create table recipe (
id integer generated always as identity primary key,
title text unique not null,
picture text,
description text,
keywords text[] not null,
rating float
);
create table step (
recipeId integer not null references recipe (id),
ordinality integer not null,
description text not null,
primary key (recipeId, ordinality)
);
create table ingredientInRecipe (
recipeId integer not null references recipe (id),
yield integer not null,
ingredientId integer not null references ingredient (id),
quantity decimal(5,3) not null,
unit text not null
);
<0,N>-<0,N>
relations lead to
a new table
returning
clause with insertIntoRecipe(id) as (
insert into pgwaffles.recipe(title,
picture,
description,
keywords,
rating)
(select title,
image,
description,
keywords,
rating
from processedRecipe)
on conflict (title) do update set title = excluded.title
returning id
),
ingredients (data) as (
select
/* we'll remove parentethis and what's inside them as it can cause
* problems and those details shouldn't be necessary */
regexp_replace(jsonb_array_elements_text(ingredients),
' *\([^\)]+\) *',
' '
)
from processedRecipe
),
processedIngredients(data) as (
select
regexp_split_to_array (
/* we'll change those horrible fractions into real numbers */
case
when data ~ ' ½' then regexp_replace(data, ' ½','.5')
when data ~ '½' then regexp_replace(data, '½','0.5')
when data ~ ' ⅓' then regexp_replace(data, ' ⅓','.3')
when data ~ '⅓' then regexp_replace(data, '⅓','0.3')
when data ~ ' ⅕' then regexp_replace(data, ' ⅕','.2')
when data ~ '⅕' then regexp_replace(data, '⅕','0.2')
when data ~ ' ⅙' then regexp_replace(data, ' ⅙','.17')
when data ~ '⅙' then regexp_replace(data, '⅙','0.17')
when data ~ ' ⅛' then regexp_replace(data, ' ⅛','.125')
when data ~ '⅛' then regexp_replace(data, '⅛','0.125')
when data ~ ' ⅔' then regexp_replace(data, ' ⅔','.7')
when data ~ '⅔' then regexp_replace(data, '⅔','0.7')
when data ~ ' ⅖' then regexp_replace(data, ' ⅖','.4')
when data ~ '⅖' then regexp_replace(data, '⅖','0.4')
when data ~ ' ⅚' then regexp_replace(data, ' ⅚','.8')
when data ~ '⅚' then regexp_replace(data, '⅚','0.8')
when data ~ ' ⅜' then regexp_replace(data, ' ⅜','.375')
when data ~ '⅜' then regexp_replace(data, '⅜','0.375')
when data ~ ' ¾' then regexp_replace(data, ' ¾','.75')
when data ~ '¾' then regexp_replace(data, '¾','0.75')
when data ~ ' ⅗' then regexp_replace(data, ' ⅗','.6')
when data ~ '⅗' then regexp_replace(data, '⅗','0.6')
when data ~ ' ⅝' then regexp_replace(data, ' ⅝','.625')
when data ~ '⅝' then regexp_replace(data, '⅝','0.625')
when data ~ ' ⅞' then regexp_replace(data, ' ⅞','.875')
when data ~ ' ⅞' then regexp_replace(data, '⅞','.875')
when data ~ '⅘' then regexp_replace(data, ' ⅘','.8')
when data ~ ' ⅘' then regexp_replace(data, '⅘','0.8')
when data ~ '¼' then regexp_replace(data, ' ¼','.25')
when data ~ ' ¼' then regexp_replace(data, '¼','0.25')
when data ~ '⅐' then regexp_replace(data, ' ⅐','.14')
when data ~ ' ⅐' then regexp_replace(data, '⅐','0.14')
when data ~ '⅑' then regexp_replace(data, ' ⅑','.11')
when data ~ ' ⅑' then regexp_replace(data, '⅑','0.11')
when data ~ '⅒' then regexp_replace(data, ' ⅒','.1')
when data ~ ' ⅒' then regexp_replace(data, '⅒','0.1')
else data
end,
' ')
from ingredients
),
insertIntoIngredient(id) as (
insert into pgwaffles.ingredient (name)
(select lower(array_to_string(array_remove(array_remove(data,data[1]),data[2]),' '))
from processedIngredients
)
on conflict (name) do update set name = excluded.name
returning id, name
),
insertIntoIngredientInRecipe(id) as (
insert into pgwaffles.ingredientInRecipe(
ingredientId,
recipeId,
quantity,
yield,
unit
)
(select insertIntoIngredient.id,
insertIntoRecipe.id,
processedIngredients.data[1]::decimal(5,3) as quantity,
processedRecipe.yield,
processedIngredients.data[2] as unit
from insertIntoIngredient
inner join processedIngredients
on insertIntoIngredient.name = lower(array_to_string(array_remove(array_remove(processedIngredients.data,processedIngredients.data[1]),data[2]),' ')),
insertIntoRecipe,
processedRecipe
)
returning recipeId
)
insert into pgwaffles.step(recipeId,
ordinality,
description)
(select distinct insertIntoIngredientInRecipe.id,
steps_with_ordinality.ordinality,
steps_with_ordinality."text"
from processedRecipe,
insertIntoIngredientInRecipe,
rows from (jsonb_to_recordset(steps) as ("text" text,"@type" text))
with ordinality steps_with_ordinality)
;
with bestRecipe(recipeId) as (
select id
from recipe
order by rating
limit 1
)
select
ingredientInRecipe.yield,
ingredientInRecipe.quantity,
ingredientInRecipe.unit,
ingredient.name
from ingredientInRecipe
inner join bestRecipe
on ingredientInRecipe.recipeId = bestRecipe.recipeId
inner join ingredient
on ingredientInRecipe.ingredientId = ingredient.id
yield | quantity | unit | name
-------+----------+-------------+------------------
10 | 3.000 | eggs |
10 | 1.000 | package | active dry yeast
10 | 3.000 | cups | flour
10 | 0.750 | cup | lukewarm milk
10 | 1.000 | cup | melted butter
10 | 1.500 | cups | pearl sugar
10 | 0.500 | teaspoon | salt
10 | 2.000 | teaspoons | vanilla extract
10 | 1.500 | tablespoons | white sugar
(9 rows)
with bestRecipe(recipeId) as (
select id
from recipe
order by rating
limit 1
)
select description
from step
inner join bestRecipe
on step.recipeId = bestRecipe.recipeId
order by ordinality
-[ RECORD 1 ]----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
description | Sprinkle the yeast and white sugar over warm milk in a small bowl. The milk should be no more than 100 degrees F (40 degrees C). Let stand for 15 minutes until the yeast softens and begins to form a creamy foam. +
|
-[ RECORD 2 ]----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
description | Whisk the eggs, melted butter, and vanilla extract into the yeast mixture until evenly blended; set aside. Stir together the flour and salt in a separate large bowl, and make a well in the center. Pour the egg mixture into the well, then stir in the flour mixture until a soft dough forms. Cover with a light cloth and let rise in a warm place (80 to 95 degrees F (27 to 35 degrees C)) until doubled in volume, about 30 minutes. Gently mix in the pearl sugar.+
|
-[ RECORD 3 ]----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
description | Preheat a waffle iron according to manufacturer's instructions. +
|
-[ RECORD 4 ]----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
description | Place a baseball-size ball of dough on the preheated waffle iron. Cook waffles until golden and crisp, about 2 minutes. Repeat with remaining dough. Allow waffles to cool for 2 to 3 minutes before serving. +
|
Put all your sql code inside a file named after your extension with versioning
pgwaffles--1.sql
--complain if script is sourced in psql,
--rather than via CREATE EXTENSION
\echo Use "CREATE EXTENSION pgwaffles" to load this file.
\quit
EXTENSION = pgwaffles
DATA = pgwaffles--1.sql
PG_CONFIG = pg_config
PGXS := $(shell $(PG_CONFIG) --pgxs)
include $(PGXS)
The makefile will need a control file
# postgreswaffles extension
# comment = 'Displays the best waffles recipe regarding allrecipes.com'
default_version = '1'
module_pathname = '$libdir/pgwaffles'
relocatable = false
Installing the extension
sudo make install
/bin/sh /usr/local/pgsql/lib/pgxs/src/makefiles/../../config/install-sh -c -d '/usr/local/pgsql/share/extension'
/bin/sh /usr/local/pgsql/lib/pgxs/src/makefiles/../../config/install-sh -c -d '/usr/local/pgsql/share/extension'
/usr/bin/install -c -m 644 .//pgwaffles.control '/usr/local/pgsql/share/extension/'
/usr/bin/install -c -m 644 .//pgwaffles--1.sql '/usr/local/pgsql/share/extension/'
laetitia=# create extension pgwaffles;
CREATE EXTENSION
laetitia=# \dn
List of schemas
Name | Owner
-----------+----------
pgwaffles | laetitia
public | postgres
(2 rows)
laetitia=# \dt pgwaffles.*
List of relations
Schema | Name | Type | Owner
-----------+--------------------+-------+----------
pgwaffles | ingredient | table | laetitia
pgwaffles | ingredientinrecipe | table | laetitia
pgwaffles | recipe | table | laetitia
pgwaffles | step | table | laetitia
(4 rows)
laetitia=# \df pgwaffles.*
List of functions
Schema | Name | Result data type | Argument data types | Type
-----------+--------------------+------------------+-----------------------------------------------------------------------+------
pgwaffles | displayingredients | SETOF record | OUT yield integer, OUT quantity numeric, OUT unit text, OUT name text | func
pgwaffles | displayrecipe | SETOF text | | func
(2 rows)
laetitia=# select * from pgwaffles.displayingredients();
yield | quantity | unit | name
-------+----------+-------------+------------------
10 | 3.000 | eggs |
10 | 1.000 | package | active dry yeast
10 | 3.000 | cups | flour
10 | 0.750 | cup | lukewarm milk
10 | 1.000 | cup | melted butter
10 | 1.500 | cups | pearl sugar
10 | 0.500 | teaspoon | salt
10 | 2.000 | teaspoons | vanilla extract
10 | 1.500 | tablespoons | white sugar
(9 rows)
-[ RECORD 1 ]-+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
displayrecipe | Sprinkle the yeast and white sugar over warm milk in a small bowl. The milk should be no more than 100 degrees F (40 degrees C). Let stand for 15 minutes until the yeast softens and begins to form a creamy foam. +
|
-[ RECORD 2 ]-+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
displayrecipe | Whisk the eggs, melted butter, and vanilla extract into the yeast mixture until evenly blended; set aside. Stir together the flour and salt in a separate large bowl, and make a well in the center. Pour the egg mixture into the well, then stir in the flour mixture until a soft dough forms. Cover with a light cloth and let rise in a warm place (80 to 95 degrees F (27 to 35 degrees C)) until doubled in volume, about 30 minutes. Gently mix in the pearl sugar.+
|
-[ RECORD 3 ]-+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
displayrecipe | Preheat a waffle iron according to manufacturer's instructions. +
|
-[ RECORD 4 ]-+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
displayrecipe | Place a baseball-size ball of dough on the preheated waffle iron. Cook waffles until golden and crisp, about 2 minutes. Repeat with remaining dough. Allow waffles to cool for 2 to 3 minutes before serving. +
|
git archive --format zip --output pgwaffles-1.0.0.zip master